87 research outputs found

    Opening black box data mining models using sensitivity analysis

    There are several supervised learning Data Mining (DM) methods, such as Neural Networks (NN), Support Vector Machines (SVM) and ensembles, that often attain high-quality predictions, although the obtained models are difficult for humans to interpret. In this paper, we open these black box DM models by using a novel visualization approach that is based on a Sensitivity Analysis (SA) method. In particular, we propose a Global SA (GSA), which extends the applicability of previous SA methods (e.g. to classification tasks), and several visualization techniques (e.g. variable effect characteristic curve), for assessing input relevance and effects on the model’s responses. We show the GSA capabilities by conducting several experiments, using a NN ensemble and SVM model, in both synthetic and real-world datasets.

    Using sensitivity analysis and visualization techniques to open black box data mining models

    In this paper, we propose a new visualization approach based on a Sensitivity Analysis (SA) to extract human understandable knowledge from supervised learning black box data mining models, such as Neural Networks (NN), Support Vector Machines (SVM) and ensembles, including Random Forests (RF). Five SA methods (three of which are purely new) and four measures of input importance (one novel) are presented. Also, the SA approach is adapted to handle discrete variables and to aggregate multiple sensitivity responses. Moreover, several visualizations for the SA results are introduced, such as input pair importance color matrix and variable effect characteristic surface. A wide range of experiments was performed in order to test the SA methods and measures by fitting four well-known models (NN, SVM, RF and decision trees) to synthetic datasets (five regression and five classification tasks). In addition, the visualization capabilities of the SA are demonstrated using four real-world datasets (e.g., bank direct marketing and white wine quality). The work of P. Cortez was funded by FEDER, through the program COMPETE and the Portuguese Foundation for Science and Technology (FCT), within the project FCOMP-01-0124-FEDER-022674. Also, the authors wish to thank the anonymous reviewers for their helpful comments.
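
    The general idea of a sensitivity analysis on a black box model can be illustrated with a minimal sketch. It is not any of the five SA methods from the paper: it is a simple one-dimensional variant that assumes each input is varied over its observed range while the others are held at their mean, and that importance is measured by the range of the responses. The names `sensitivity_1d` and `toy_model`, the `levels` parameter and the data are hypothetical.

```python
from statistics import mean


def sensitivity_1d(model, data, levels=7):
    """One-dimensional sensitivity analysis (illustrative assumption only).

    Each input is swept over `levels` equally spaced values between its
    observed minimum and maximum while all other inputs stay at their mean.
    Returns normalized importances and the response curve of each input.
    """
    columns = list(zip(*data))
    baseline = [mean(col) for col in columns]
    ranges, curves = [], []
    for a, col in enumerate(columns):
        lo, hi = min(col), max(col)
        responses = []
        for j in range(levels):
            x = list(baseline)
            x[a] = lo + (hi - lo) * j / (levels - 1)
            responses.append(model(x))
        curves.append(responses)
        # Range measure: how far the output moves when only input a changes.
        ranges.append(max(responses) - min(responses))
    total = sum(ranges)
    importance = [r / total if total > 0 else 0.0 for r in ranges]
    return importance, curves


def toy_model(x):
    # Hypothetical black box: strong effect of x[0], weak of x[1], none of x[2].
    return 3.0 * x[0] + 0.5 * x[1] ** 2


data = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [0.5, 0.2, 0.9], [0.2, 0.8, 0.4]]
importance, curves = sensitivity_1d(toy_model, data)
print(importance)
```

    Plotting each entry of `curves` against the swept input values gives a curve in the spirit of the variable effect characteristic curve mentioned in the abstracts above, under the same simplifying assumptions.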

    Helium-3 blankets for tritium breeding in fusion reactors

    It is concluded that He-3 blankets offer considerable promise for tritium breeding in fusion reactors: good breeding potential, low operational risk, and attractive safety features. The availability of He-3 resources is the key issue for this concept. There is sufficient He-3 from decay of military stockpiles to meet the International Thermonuclear Experimental Reactor needs. Extraterrestrial sources of He-3 would be required for a fusion power economy.

    Neural Networks for Text-to-Speech Phoneme Recognition

    This paper presents two different artificial neural network approaches for phoneme recognition for text-to-speech applications: Staged Backpropagation Neural Networks and Self-Organizing Maps. Several current commercial approaches rely on an exhaustive dictionary approach for text-to-phoneme conversion. Applying neural networks for phoneme mapping for text-to-speech conversion creates a fast distributed recognition engine. This engine not only supports the mapping of words missing from the database, but it can also mitigate contradictions related to different pronunciations for the same word. The ANNs presented in this work were trained based on the 2000 most common words in American English. Performance metrics for the 5000, 7000 and 10000 most common words in English were also estimated to test the robustness of these neural networks.

    Modeling and simulation with operator scaling

    Self-similar processes are useful in modeling diverse phenomena that exhibit scaling properties. Operator scaling allows a different scale factor in each coordinate. This paper develops practical methods for modeling and simulating stochastic processes with operator scaling. A simulation method for operator stable Levy processes is developed, based on a series representation, along with a Gaussian approximation of the small jumps. Several examples are given to illustrate practical applications. A classification of operator stable Levy processes in two dimensions is provided according to their exponents and symmetry groups. We conclude with some remarks and extensions to general operator self-similar processes. Comment: 29 pages, 13 figures.

    Extreme value laws in dynamical systems under physical observables

    Extreme value theory for chaotic dynamical systems is a rapidly expanding area of research. Given a system and a real function (observable) defined on its phase space, extreme value theory studies the limit probabilistic laws obeyed by large values attained by the observable along orbits of the system. Based on this theory, the so-called block maximum method is often used in applications for statistical prediction of large value occurrences. In this method, one performs inference for the parameters of the Generalised Extreme Value (GEV) distribution, using maxima over blocks of regularly sampled observations along an orbit of the system. The observables studied so far in the theory are expressed as functions of the distance with respect to a point, which is assumed to be a density point of the system's invariant measure. However, this is not the structure of the observables typically encountered in physical applications, such as windspeed or vorticity in atmospheric models. In this paper we consider extreme value limit laws for observables which are not functions of the distance from a density point of the dynamical system. In such cases, the limit laws are no longer determined by the functional form of the observable and the dimension of the invariant measure: they also depend on the specific geometry of the underlying attractor and of the observable's level sets. We present a collection of analytical and numerical results, starting with a toral hyperbolic automorphism as a simple template to illustrate the main ideas. We then formulate our main results for a uniformly hyperbolic system, the solenoid map. We also discuss non-uniformly hyperbolic examples of maps (Hénon and Lozi maps) and of flows (the Lorenz63 and Lorenz84 models). Our purpose is to outline the main ideas and to highlight several serious problems found in the numerical estimation of the limit laws.
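
    The first step of the block maximum method described above, taking maxima over blocks of regularly sampled observations along an orbit, can be sketched as follows. This is only an illustration under stated assumptions: the logistic map stands in for the dynamical system (the paper studies other maps and flows), the observable is a hypothetical distance-based one, and the helper names `orbit` and `block_maxima`, the reference point 0.3, the initial condition and the block size are all invented for the example.

```python
def orbit(x0, n, step=lambda x: 4.0 * x * (1.0 - x)):
    """Return n points of an orbit; the default step is the logistic map
    (an assumed stand-in system, not one taken from the paper)."""
    xs, x = [], x0
    for _ in range(n):
        xs.append(x)
        x = step(x)
    return xs


def block_maxima(values, block_size):
    """Split the series into consecutive full blocks and keep each maximum."""
    n_blocks = len(values) // block_size
    return [
        max(values[i * block_size:(i + 1) * block_size])
        for i in range(n_blocks)
    ]


def observable(x, point=0.3):
    # Hypothetical observable: a function of the distance to a chosen point,
    # large (close to 0) when the orbit passes near that point.
    return -abs(x - point)


series = [observable(x) for x in orbit(0.1234, 10_000)]
maxima = block_maxima(series, block_size=100)
print(len(maxima), max(maxima))
```

    Inference for the GEV parameters would then be performed on `maxima`, for example with a maximum likelihood routine such as `scipy.stats.genextreme.fit`; that step is omitted here to keep the sketch free of third-party dependencies.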

    Scope for Credit Risk Diversification

    This paper considers a simple model of credit risk and derives the limit distribution of losses under different assumptions regarding the structure of systematic risk and the nature of exposure or firm heterogeneity. We derive fat-tailed correlated loss distributions arising from Gaussian risk factors and explore the potential for risk diversification. Where possible the results are generalised to non-Gaussian distributions. The theoretical results indicate that if the firm parameters are heterogeneous but come from a common distribution, for sufficiently large portfolios there is no scope for further risk reduction through active portfolio management. However, if the firm parameters come from different distributions, then further risk reduction is possible by changing the portfolio weights. In either case, neglecting parameter heterogeneity can lead to underestimation of expected losses. But, once expected losses are controlled for, neglecting parameter heterogeneity can lead to overestimation of risk, whether measured by unexpected loss or value-at-risk.

    Machines and Kernel Partial Least Squares. The hyperparameters

    We describe the use of machine learning for pattern recognition in magnetocardiography (MCG) that measures magnetic fields emitted by the electrophysiological activity of the heart. We used direct kernel methods to separate abnormal MCG heart patterns from normal ones. For unsupervised learning, w